Boosting Regression via Classification

Author

  • MASSIMO SANTINI
Abstract

Boosting strategies are methods of improving the accuracy of a prediction (a classification rule) by combining many weaker predictions, each of which is only moderately accurate. In this paper we present a concise analysis of Freund and Schapire's AdaBoost algorithm [FS97], from which we derive a new boosting strategy for the regression case that extends the algorithm discussed in [BCP97].

1 Boosting classification

Classification refers in general to the problem of predicting a label in a finite set L for each element of a set I of instances, according to some relationship between instances and labels that can be thought of as an (unknown) deterministic mapping I → L or as a joint probability distribution over I × L; a prediction is then a mapping I → L whose error is some measure of the discrepancy between the predicted and intended labels (according to the unknown mapping or the joint distribution). When the cardinality of L is 2, the classification is said to be binary; otherwise it is said to be multiclass. The goal of a boosting strategy is to combine a family of simple predictions in order to obtain a final prediction with smaller error than the simple ones. For computability reasons, attention is usually restricted to a sample, i.e. a finite number of pairs (i, l) drawn from I × L according to some given distribution; furthermore, generalization results based on the Vapnik-Chervonenkis theory [VC71] are usually provided (see, for instance, [FS97]) to relate the error on the sample to the error with respect to the whole set of instances. For these reasons the following formal proofs concern a probability space ⟨Ω, F, P⟩ and a mapping c : Ω → {−1, +1}, where Ω is a finite set representing the instance part of the samples and ...
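
To make the combination step concrete, here is a minimal sketch of AdaBoost for the binary case, assuming labels in {−1, +1}. The `weak_learner(X, y, w)` callable and the parameter names are illustrative assumptions, not notation taken from [FS97]:

```python
import numpy as np

def adaboost(X, y, weak_learner, rounds=50):
    """Sketch of AdaBoost; `weak_learner(X, y, w)` must return a hypothesis
    h with h(X) in {-1, +1}, trained on the w-weighted sample."""
    n = len(y)
    w = np.full(n, 1.0 / n)                # initial uniform distribution over the sample
    hypotheses, alphas = [], []
    for _ in range(rounds):
        h = weak_learner(X, y, w)          # train a weak hypothesis on the weighted sample
        pred = h(X)
        eps = w[pred != y].sum()           # weighted training error of h
        if eps >= 0.5:                     # no better than random guessing: stop
            break
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-12))
        w *= np.exp(-alpha * y * pred)     # up-weight the examples h got wrong
        w /= w.sum()
        hypotheses.append(h)
        alphas.append(alpha)
    # final prediction: the sign of the weighted vote of the weak hypotheses
    return lambda X_: np.sign(sum(a * h(X_) for a, h in zip(alphas, hypotheses)))
```

The reweighting step is what forces each successive weak hypothesis to concentrate on the examples its predecessors misclassified, which is how the moderately accurate pieces combine into a final prediction with smaller error.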


Similar articles

On Weak Base Learners for Boosting Regression and Classification

The most basic property of the boosting algorithm is its ability to reduce the training error, subject to the critical assumption that the base learners generate weak hypotheses that are better than random guessing. We exploit analogies between regression and classification to give a characterization of which base learners generate weak hypotheses, by introducing a geometric concept called the an...

Some Results on Weakly Accurate Base Learners for Boosting Regression and Classification

One basic property of the boosting algorithm is its ability to reduce the training error, subject to the critical assumption that the base learners generate `weak' (or more appropriately, `weakly accurate') hypotheses that are better than random guessing. We exploit analogies between regression and classification to give a characterization of which base learners generate weak hypotheses, by introdu...

Large Time Behavior of Boosting

We exploit analogies between regression and classification to study certain properties of boosting algorithms. A geometric concept called the angular span is defined and related to analogs of the VC dimension and the pseudo dimension of the regression and classification systems, and to the assumption of the weak learner. The exponential convergence rates of boosting algorithms are shown to be ...

Boosting with the L2-loss: Regression and Classification

This paper investigates a computationally simple variant of boosting, L2Boost, which is constructed from a functional gradient descent algorithm with the L2-loss function. Like other boosting algorithms, L2Boost applies a pre-chosen fitting method, called the learner, repeatedly in an iterative fashion. Based on the explicit expression of the refitting of residuals in L2Boost, the case with (symmetri...
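
The refitting loop this abstract describes is simple enough to sketch. The sketch below assumes a `fit_learner(X, r)` helper and a shrinkage factor, both common practical choices of my own, not details prescribed by the paper:

```python
import numpy as np

def l2boost(X, y, fit_learner, rounds=100, shrinkage=0.1):
    """Sketch of L2Boost; `fit_learner(X, r)` must return a callable f with f(X) ≈ r."""
    residual = y.astype(float).copy()      # for the L2 loss, the negative gradient is the residual
    fits = []
    for _ in range(rounds):
        f = fit_learner(X, residual)       # refit the pre-chosen learner to the current residuals
        residual -= shrinkage * f(X)       # one functional gradient descent step
        fits.append(f)
    return lambda X_: shrinkage * sum(f(X_) for f in fits)
```

Each round fits the learner to whatever the current ensemble has not yet explained, so the sum of the (shrunken) fits performs gradient descent on the L2 loss in function space.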

Improving the Performance of Boosting for Naive Bayesian Classification

This paper investigates boosting naive Bayesian classification. It first shows that boosting cannot improve the accuracy of the naive Bayesian classifier on average in a set of natural domains. By analyzing the reasons for boosting's failures, we propose to introduce tree structures into naive Bayesian classification to improve the performance of boosting when working with naive Bayesian classificati...


Publication date: 1998